Abstract — Agile recovery from link failures in autonomic 
communication networks is essential to increase robustness, 
accessibility, and reliability of data transmission. However, this 
must be done with the least amount of protection resources, while 
using simple management plane functionality. Recently, network 
coding has been proposed as a solution to provide agile and cost 
efficient network self-healing against link failures, in a manner 
that does not require data rerouting, packet retransmission, 
or failure localization, hence leading to simple control and 
management planes. To achieve this, separate paths have to be 
provisioned to carry encoded packets, hence requiring either the 
addition of extra links, or reserving some of the resources for 
this purpose. 

In this paper we introduce autonomic self-healing strategies 
for autonomic networks in order to protect against link failures. 
The strategies are based on network coding and reduced capacity, 
which is a technique that we call network protection codes (NPC). 
In these strategies, an autonomic network is able to provide 
self-healing from various network failures affecting network 
operation. The techniques improve service and enhance reliability 
of autonomic communication. 

Network protection codes are extended to provide self-healing 
from multiple link failures in autonomic networks. Although this 
leads to reducing the network capacity, the network capacity 
reduction is asymptotically small in most cases of practical 
interest. We provide implementation aspects of the proposed 
strategies. We present bounds and network protection code 
constructions. Furthermore, tables of the best known self-healing 
codes are presented. Finally, we study the construction of such 
codes over the binary field. The paper also develops an Integer 
Linear Program formulation to evaluate the cost of provisioning 
connections using the proposed strategies, and uses results from 
this formulation to show that it is more resource efficient than 
1+1 protection. 

Index Terms — Autonomic networks; network protection codes, 
self-healing, link failures, network coding, channel coding, and 
code constructions. 



I. Introduction 

Today's communication networks are becoming complex to 
the degree that the management of such networks has become 
a major task of network operation. Therefore, the use of 
network autonomy, in which the management functionality 
and its complexity are moved to within the network, has 
become the preferred approach, hence giving rise to what 
is known as autonomic networks [19]. Autonomic networks 
are self-managed, and they are efficient, resilient, and evolvable 
through self-protection, self-organization, self-configuration, 
self-healing, and self-optimization (see for example [8], [10], 
[21] and the references therein). Therefore an autonomic 
network promotes the autonomy of operational networks with 
minimum human involvement. However, it is also important 
not to overload the management plane of autonomic networks 
to the degree that the management functionality consumes a 
significant amount of computing and communication resources. 
This paper addresses the self-functionality in autonomic 
networks, and introduces a technique to provide self-healing 
that results in simplifying the management plane, as well as 
the control plane. The technique uses reduced capacities and 
network coding. 

This paper was presented in part at the IEEE Globecom 2008 Conference, 
New Orleans, LA, December 1-4, 2008 [2]. 

Network coding is a powerful tool that has been used to 
increase the throughput, capacity, and performance of com- 
munication networks [20], [23]. It offers benefits in terms 
of energy efficiency, additional security, and reduced delay. 
Network coding allows the intermediate nodes not only to 
forward packets using network scheduling algorithms, but 
also encode/decode them using algebraic primitive operations 
(see [1], [7], [20], [23] and references therein). 

One application of network coding that has been proposed 
recently is to provide network protection against link failures 
in overlay networks [12], [15]. This is achieved by transmitting 
combinations of data units from multiple connections on a 
backup path in a manner that enables each receiver node to 
recover a copy of the data transmitted on the working path in 
case the working path fails. This can result in recovery from 
failures without data rerouting, hence achieving agile protec- 
tion. Moreover, the sharing of network protection resources be- 
tween multiple connections through the transmission of linear 
combinations of data units results in efficient use of protection 
resources. This, however, requires the establishment of extra 
paths over which the combined data units are transmitted. 
Such paths may require the addition of links to the network 
under the Separate Capacity Provisioning strategy (SCP), or 
that paths be provisioned using existing links if using the 
Joint Capacity Provisioning strategy (JCP), hence reducing the 
network traffic carrying capacity. 

Certain networks can allow extra transmissions and the 
addition of bandwidth, but they do not allow the addition of 
new transmission lines. In this scenario, one needs to design 
efficient data recovery schemes. In this paper, we propose 
such an approach in which we use network coding to provide 
agile, and resource efficient protection against link failures, 
and without adding extra paths. The approach is based on 
combining data units from a number of sources, and then 
transmitting the encoded data units using a small fraction of 
the bandwidth allocated to the connections, hence disposing of 
the requirement of having extra paths. In this scenario, once 
a path fails, the receiver can recover the lost packets easily 
from the neighbors by initiating simple queries. 

Previous solutions in network survivability approaches using 
network coding focused on providing backup paths to recover 
the data affected by the failures [12]-[14]. Such approaches 
include 1+N and M+N protection. In 1+N protection, an 
extra secondary path is used to carry combinations of data 
units from N different connections, and is therefore used to 
protect N primary paths from any single link failure. M+N 
protection is an extension of 1+N protection in which M extra 
secondary paths are needed to protect against multiple link failures. 

In this paper, we introduce autonomic self-healing and 
healing-protection network strategies based on network cod- 
ing and reduced capacity. In these strategies, an autonomic 
network is able to provide self-healing from various network 
failures. The techniques improve services and enhance relia- 
bility of autonomic communication. We define the concept of 
network protection codes similar to error-correcting codes that 
are widely used in channel coding [9], [16]. Such codes aim to 
provide better provisioning and data recovery mechanisms [2]. 

The new contributions in this paper are stated as follows: 

i) We introduce a self-healing strategy using network coding 
and a reduced capacity strategy instead of using dedicated 
paths. 

ii) We provide a new scheme to protect against a single link 
failure in autonomic networks. The scheme is extended 
to protect against multiple link failures. 

iii) We develop a theoretical foundation of protection codes, 
in which the receivers are able to recover data sent over 
t failed links out of n primary links. 

iv) The developed protection strategies are achieved over the 
binary field, hence the encoding and decoding operations 
are done using XOR operation. 

This paper is organized as follows. In Section II we briefly 
state the related work and previous solutions to the network 
protection problem against link failures. In Section III we 
present the network model and problem definition. Sections IV 
and V discuss single and multiple link failures and how to 
protect against these link failures using reduced capacity and 
network coding. In Section VI we give analysis of the general case 
of $t \ll n$ link failures. Sections VII and VIII present code 
constructions and bounds on the network protection code 
parameters. In Section IX we present an integer linear program 
to find the optimal provisioning under the proposed scheme. 
Section X introduces some numerical results based on the ILP 
and a comparison between 1+1 protection and the proposed 
scheme. The final section concludes the paper. 
Notations: We fix the notation throughout the paper. Let $n$, $k$, 
$m$, and $t$ be the number of total connections, working paths, 
protection paths, and failures, respectively, where $n = k + m$ 
and $t < k$. Let $L_i$ be a connection from a sender $s_i$ to a 
receiver $r_i$. Let $c_i$ be the unit capacity of the connection $L_i$ 
if it carries plain data (data without coding). $\mathbb{F}_2$ is a finite 
field with two elements $\{0, 1\}$. An $[n, k, d_{min}]_2$ code is a network 
protection code defined over $\mathbb{F}_2$ that has $n$ connections, $k$ 
working paths, $n - k = m$ protection paths, and recovers from 
$t = d_{min} - 1$ failures, where $d_{min}$ is the minimum distance 
of the code. 

II. Related Work 

In this section we will state the related work in network 
protection strategies against link failures, and linear codes 
that are used for erasure channels. We define the concept of 
network protection codes similar to error-correcting codes that 
are widely used in erasure channel coding [9], [16]. 

A. Revolution Networks Using Network Coding 

Network coding is a powerful tool that has been used 
to increase the throughput, capacity, and performance of 
communication networks [20], [23]. Network coding assumes 
that the network nodes can not only forward incoming 
messages/packets, but can also encode and decode them. It offers 
benefits in terms of energy efficiency, additional security, and 
reduced delay (see [1], [7], [20], [23] and references therein). 
Practical aspects of network coding have been investigated 
in [6], and bounds on the network coding capacity are inves- 
tigated in [3], [18]. 

B. Protection against Failures Using Network Coding 

In [12], the author introduced a 1+N protection model in 
optical mesh networks using network coding over p-cycles. 
The author suggested a model for protecting N connections 
from a set of sources to a set of receivers in a network with n 
connections, where only one connection might fail. Hence, the 
suggested model can protect against a single link failure in any 
arbitrary path connecting a source and destination. In [13], the 
author extended the previous model to protect against multiple 
link failures. It is shown that to protect against $m$ failures, at 
least $m$ p-cycles are needed. The idea was to derive $m$ linearly 
independent equations to recover the data sent from $m$ sources. 
In [14], the author extended the protection model in [12] and 
provided a GMPLS-based implementation of a link protection 
strategy that is a hybrid of 1+N and 1:N. It is claimed that the 
hybrid 1+N link protection provides protection at higher layers 
and with a speed that is comparable to the speed achieved by 
the physical layer implementations. In addition, it has lower cost 
and greater flexibility. 

In this paper, we provide a new technique for protecting a 
network against failures using protection codes and reduced 
capacity, and for the network to recover from such failures in 
an agile manner. The benefits of our approach are that: 

i) It allows receivers to recover the lost data without 
data rerouting, data retransmission or failure localization, 
hence simplifying the control and management planes. 

ii) It has reasonable computational complexity and does not 
require adding extra paths or reserving backup paths. 

iii) At any point in time, all $n$ connection paths have full 
capacity except for one path when protecting against 
a single link failure, and $m < n$ paths when protecting 
against $t < m$ link failures. 

iv) The working and protection path capacities are distributed 
among all connections for fairness. 

III. Network Model 

Let $\mathcal{G} = (V, E)$ be a graph which represents the network 
topology. $V$ is a set of network nodes and $E$ is a set of edges. 
Let there be $n$ unidirectional connections, and let $S \subset V$ be 
the set of sources $\{s_1, \ldots, s_n\}$ and $R \subset V \setminus S$ be the set of 
receiver nodes $\{r_1, \ldots, r_n\}$ of the $n$ connections in $\mathcal{G}$. The 
case of $S \cap R \neq \emptyset$ can be easily incorporated in our model. 
Two nodes $u$ and $v$ in $V$ are connected by an edge $(u, v)$ in $E$ 
if there is a direct connection between them. We assume that 
the sources are independent of each other, meaning they can 
only send messages and there is no correlation between them. 
For simplicity, we will assume that a path exists between $s_i$ 
and $r_i$, and it is disjoint from the path between $s_j$ and $r_j$, for 
$j \neq i$. 

The network model $\mathcal{N}$ can be described by the following 
assumptions. 

i) Let $\mathcal{N}$ be a network with a set of sources 
$S = \{s_1, s_2, \ldots, s_n\}$ and a set of receivers 
$R = \{r_1, r_2, \ldots, r_n\}$, where $S \cup R \subset V$. 

ii) Let $L$ be a set of links $L_1, L_2, \ldots, L_n$ such that there is a 
link $L_i$ if and only if there is a connection path between 
the sender $s_i$ and receiver $r_i$, i.e., $L_i$ corresponds to the 
path 

$$ \{(s_i, w_{1i}), (w_{1i}, w_{2i}), \ldots, (w_{(\lambda)i}, r_i)\}, \tag{1} $$

where $1 \le i \le n$ and $(w_{(j-1)i}, w_{ji}) \in E$, for some 
integer $\lambda \ge 1$. Hence we have $|S| = |R| = |L| = n$. The 
$n$ connection paths are pairwise link disjoint. 

iii) Every source $s_\ell$ sends a packet with its own ID $\mathrm{ID}_{s_\ell}$ and 
data $x_\ell$ to the receiver $r_\ell$, so 

$$ \mathrm{packet}_{s_\ell} = (\mathrm{ID}_{s_\ell}, x_\ell, \delta), \tag{2} $$

where $\delta$ is the round number of the source packet 
$\mathrm{packet}_{s_\ell}$. 

iv) All packets belonging to the same round are sent in the 
same round slot. The senders will exchange the role of 
sending plain and encoded data for fairness, as will be 
illustrated below. 

v) All links carry unidirectional data from sources to 
receivers. 

vi) We consider the scenario where the cost of adding a 
new path is higher than just combining messages in an 
existing path, or there are not enough resources to provision 
dedicated paths in the network. 

We can define the unit capacity $c_i$ of a link $L_i$ as follows. 
Definition 1: The unit capacity of a connecting path $L_i$ 
between $s_i$ and $r_i$ is defined by 

$$ c_i = \begin{cases} 1, & L_i \text{ is an active working path;} \\ 0, & \text{otherwise.} \end{cases} \tag{3} $$

What we mean by an active path is that the receiver is able to 
receive and process unencoded signals/packets throughout this 
path. Hence, the protection path is assumed to be inactive. The 
total capacity of $\mathcal{N}$ is given by the summation of all active 
path capacities, divided by the number of paths. 

Fig. 1. Network protection against a single path failure using reduced 
capacity and network coding. One path out of n primary paths carries encoded 
data. The black points represent various other relay nodes. 

This means that each source $s_i$ can send a maximum of one 
packet per unit time on a link $L_i$. Assume that all links have 
the same capacity. One can also always assume that a source 
with a large rate can be divided into a set of sources, each of 
which has a unit link capacity. 

The following definition describes the working and protection 
paths between two network switches as shown in Fig. 1. 

Definition 2: The working paths in a network with n con- 
nection paths carry traffic under normal operations. The data 
on these paths are sent without encoding. The Protection paths 
in our proposed scheme carry encoded data from other sources. 
A protection scheme ensures that data sent from the sources 
will reach the receivers in case of failure on the working paths. 

Our goal is to provide an agile and resource efficient self- 
healing method for n connections without adding extra paths. 
Unencoded data is sent over a path Li without adding extra 
paths, but by possibly reducing the source rates slightly. Linear 
combinations of data units are sent on these paths alternately, 
and by using the reduction in working path capacities. The 
linear combinations are used to recover from failures. 

Clearly, if all paths are active then the total capacity of all 
connections is n. 

In general, the total normalized capacity of the network for 
the active and failed paths is computed by 

$$ C_{\mathcal{N}} = \frac{1}{n} \sum_{i=1}^{n} c_i. \tag{4} $$



IV. Protecting Networks Against A Single Link 
Failure 

In this section we study the problem of protecting a set of 
connections against a single link failure in a network with 
a set of sources S and a set of receivers R. This problem 
has been studied in [12], [13] by provisioning a path that is 
link disjoint from all connection paths, and passes through all 
sources and destinations. All source packets are encoded in one 
single packet and transmitted over this path. The encoding is 
dynamic in the sense that packets are added and removed at 
each source and destination. 

Assume that every source $s_i$ has its own message $x_i$. Also, 
source $s_j$ forms the encoded data $y_j$ which is defined by 

$$ y_j = \sum_{i=1}^{j-1} x_i \oplus \sum_{i=j+1}^{n} x_i, \tag{5} $$

where the sum is over the finite field $\mathbb{F}_2 = \{0, 1\}$. In this case, 
the symbol $\oplus$ is the XOR operation. 

Source $s_i$, for $i \neq j$, sends a packet to the receiver $r_i$, 
which is given by 

$$ \mathrm{packet}_{s_i} = (\mathrm{ID}_{s_i}, x_i, \delta). \tag{6} $$

On the other hand, source $s_j$ sends a packet that will carry the 
encoded data $y_j$ to the receiver $r_j$ over the link $L_j$, 

$$ \mathrm{packet}_{s_j} = (\mathrm{ID}_{s_j}, y_j, \delta). \tag{7} $$

Now we consider the case where there is a single failure on 
link $L_k$. Therefore, we have two cases: 

i) If $k = j$, the link $L_j$ has a failure, and the receiver $r_j$ does 
not need to query any other node since link $L_j$ carries 
encoded data that is only used for protection. All other 
receiver nodes receive their data correctly on links which 
have not failed. 

ii) If $k \neq j$, then the receiver $r_k$ needs to query the other 
$(n - 1)$ nodes in order to recover the lost data $x_k$ sent over 
the failed link $L_k$. The reason is that $x_k$ is available only 
within the encoded data $y_j$ received at $r_j$, and extracting it 
requires the data of all other receivers. $x_k$ 
can be recovered by adding all other $n - 1$ data units. 
The recovery is implemented by adding $y_j$ and all $x_i$ for 
$i \neq j$ and $i \neq k$. This follows from Equation (5). 

This shows that only one single receiver needs to perform 
$(n - 2)$ addition operations in order to recover its data if its 
link fails. In other words, all other receivers will receive the 
transmitted data from the senders of their own connections 
with a constant operation $O(1)$. 
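The single-failure recovery just described can be sketched in a few lines. This is a minimal illustration, assuming data units are modeled as small integers and addition over the binary field is bitwise XOR; the function names `encode_protection` and `recover_lost` and the sample values are hypothetical and not part of the proposed scheme.

```python
from functools import reduce
from operator import xor


def encode_protection(data, j):
    """Return y_j: the XOR of every x_i with i != j."""
    return reduce(xor, (x for i, x in enumerate(data) if i != j), 0)


def recover_lost(received, j, k):
    """Recover x_k from y_j and all x_i with i not in {j, k}.

    `received` models what the receivers hold in one round: index j holds
    the encoded value y_j, index k is None (failed link), and every other
    index holds plain data.
    """
    others = (x for i, x in enumerate(received) if i not in (j, k))
    return reduce(xor, others, received[j])


# Hypothetical round with n = 5 connections (0-based indices).
data = [0b1010, 0b0111, 0b0000, 0b1100, 0b0001]
j, k = 2, 4                               # s_j sends encoded data, L_k fails
received = list(data)
received[j] = encode_protection(data, j)  # link L_j carries y_j
received[k] = None                        # link L_k has failed
recovered = recover_lost(received, j, k)
```

The recovery folds y_j together with the n - 2 surviving plain data units, which matches the (n - 2) additions counted above.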

The following example illustrates the plain and encoded data 
transmitted from five senders to five receivers. 

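Example 3: Let $S$ and $R$ be two sets of senders and 
receivers, respectively, in the network model $\mathcal{N}$. The following 
scheme explains the plain and encoded data sent in five 
consecutive rounds from the five senders to the five receivers. 
The rounds shown form the first cycle; the same pattern repeats 
in cycles 2, 3, and so on. 

$$
\begin{array}{c|ccccc}
\text{round} & 1 & 2 & 3 & 4 & 5 \\ \hline
s_1 \to r_1 & y_1 & x_1 & x_1 & x_1 & x_1 \\
s_2 \to r_2 & x_2 & y_2 & x_2 & x_2 & x_2 \\
s_3 \to r_3 & x_3 & x_3 & y_3 & x_3 & x_3 \\
s_4 \to r_4 & x_4 & x_4 & x_4 & y_4 & x_4 \\
s_5 \to r_5 & x_5 & x_5 & x_5 & x_5 & y_5
\end{array} \tag{8}
$$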

The encoded data $y_j$, for $1 \le j \le 5$, is sent as 

$$ y_j = \sum_{i=1}^{j-1} x_i \oplus \sum_{i=j+1}^{5} x_i. \tag{9} $$

Fig. 2. Network protection against a single link failure using reduced capacity 
and network coding. One connection out of n primary working paths carries 
encoded data, i.e., it acts as the protection path. The other n - 1 active 
working paths carry plain data. 



We notice that every message has its own round. Hence the 
protection data is distributed among all paths for fairness. 
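The rotation of the encoding role in Example 3 can be generated programmatically. The sketch below assumes the simple diagonal rule suggested by the example, namely that source s_j sends encoded data in round j of each cycle; the helper name `rotation_schedule` is hypothetical.

```python
def rotation_schedule(n):
    """Return an n x n table; entry [i][r] is what source s_(i+1) sends in round r+1.

    Assumption: source s_j carries the encoded unit y_j in round j and
    plain data x_j in every other round of the cycle.
    """
    return [
        [f"y{i + 1}" if r == i else f"x{i + 1}" for r in range(n)]
        for i in range(n)
    ]


schedule = rotation_schedule(5)
for i, row in enumerate(schedule, start=1):
    print(f"s{i} -> r{i}:", " ".join(row))
```

Each column of the table contains exactly one encoded unit, so in every round one connection acts as the protection path and the remaining n - 1 carry plain data.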

A. Network Protection Codes (NPC) for a Single Link Failure 

We can define the set of sources that will send encoded 
packets by using constraint matrices. We assume that there is 
a network protection code $\mathcal{C} \subseteq \mathbb{F}_2^n$ defined by the constraint 
matrix 

$$ G = \begin{bmatrix} 1 & 0 & \cdots & 0 & 1 \\ 0 & 1 & \cdots & 0 & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 1 \end{bmatrix}_{(n-1) \times n}. \tag{10} $$

Without loss of generality, in Matrix (10), for $1 \le j \le n-1$, 
the column vector $(g_{1j}\ g_{2j}\ \ldots\ g_{(n-1)j})^T$ in $\mathbb{F}_2^{n-1}$ 
corresponds to $(n-1)$ sources, say for example the sources 
$s_1, s_2, \ldots, s_{n-1}$, that will send (update) their values to 
$(n-1)$ receivers, say $r_1, r_2, \ldots, r_{n-1}$. Also, there exists one 
source that will send encoded data, e.g., source $s_n$ in the 
above matrix. The row vector $(g_{i1}\ g_{i2}\ \ldots\ g_{in})$ in $\mathbb{F}_2^n$ 
determines the channels $L_1, L_2, \ldots, L_n$. 

The weight of a row in $G$ is the number of nonzero 
elements. We define $d_{min}$ to be the minimum weight of a 
row in $G$. Put differently, 

$$ d_{min} = \min \{\, |\{ j : g_{ij} \neq 0,\ 1 \le j \le n \}| : 1 \le i \le n-1 \,\}. \tag{11} $$

Hence, since every row in $G$ has weight two, $d_{min} = 2$. 

We can now define the network protection code that will 
protect a single path failure as follows: 

Definition 4: An $[n, n-1, 2]$ network protection code $\mathcal{C}$ is 
an $(n-1)$-dimensional subspace of the space $\mathbb{F}_2^n$ defined by the 
systematic generator matrix $G$ and is able to recover from a 
single network failure of an arbitrary path $L_i$. 

This means that an $[n, n-1, 2]$ code over $\mathbb{F}_2$ is a code that 
encodes $(n-1)$ symbols into $n$ symbols and detects (recovers 
from) a single path failure. We note that the network protection 
codes (NPC) are also error correcting codes that can be used for 
erasure channels. The positions of errors (failures) are known. 
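To make the $[n, n-1, 2]$ code concrete, the following sketch builds the generator matrix with an identity block followed by an all-ones column and checks its minimum distance by brute force. The helper names are hypothetical, and the exhaustive search is only meant for small n.

```python
from itertools import product


def single_failure_generator(n):
    """Rows of G = [I_(n-1) | 1]: an identity block followed by an all-ones column."""
    return [[1 if c == r else 0 for c in range(n - 1)] + [1] for r in range(n - 1)]


def min_distance(G):
    """Minimum Hamming weight over all nonzero codewords spanned by the rows of G."""
    k, n = len(G), len(G[0])
    best = n
    for msg in product((0, 1), repeat=k):
        if not any(msg):
            continue  # skip the all-zero message
        # Codeword = msg * G over the binary field, computed column by column.
        word = [sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G)]
        best = min(best, sum(word))
    return best


G = single_failure_generator(5)
d_min = min_distance(G)
failures_tolerated = d_min - 1
```

For this family the brute-force value agrees with the row-weight shortcut used in the text, since every row of G has weight two and every nonzero codeword has even weight.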

Remark 5: The number of failures that can be recovered 
by an NPC is equal to the minimum distance of the code 
minus one, i.e., $t = d_{min} - 1$. Sometimes we refer to an NPC 
by the number of failures $t$; otherwise it is defined by the 
minimum distance $d_{min}$, as shown in Table III. 

In general we will assume that the code $\mathcal{C}$ defined by the 
generator matrix $G$ is known for every source $s_i$ and every 
receiver $r_i$. This means that every receiver will be able to 
recover the data $x_i$ if the link $L_i$ fails, provided that $L_i$ 
is active in the sense defined in Definition 1. Hence, the 
rows of the generator matrix $G$ are the basis for the code 
$\mathcal{C}$. We assume that the positions of the failures are known. 
Furthermore, every source node has a copy of the code $\mathcal{C}$. 
Without loss of generality, the protection systematic matrix 
among all sources is given by: 

$$
\begin{array}{c|cccc|c}
 & L_1 & L_2 & \cdots & L_{n-1} & L_n \\ \hline
s_1 & x_1 & 0 & \cdots & 0 & x_1 \\
s_2 & 0 & x_2 & \cdots & 0 & x_2 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
s_{n-1} & 0 & 0 & \cdots & x_{n-1} & x_{n-1} \\ \hline
\text{total} & x_1 & x_2 & \cdots & x_{n-1} & y_n
\end{array} \tag{12}
$$
where $y_n$ is the protection value collected from every source 
$s_i$ that will be encoded at source $s_n$, for all $1 \le i \le n - 1$. 
Put differently, we have 

$$ y_n = \sum_{i=1}^{n-1} x_i, \tag{13} $$

where the summation operation is defined by the XOR operation. 

In a general scenario, the system operates in cycles, where 
each cycle consists of $n$ rounds, such that at round $1 \le j \le n$ 
of a cycle we have 

$$ y_j = \sum_{i=1,\, i \neq j}^{n} x_i, \tag{14} $$

where the round number of a packet $x_i$ is not shown for 
simplicity with the understanding that it is the first packet 
in the source's output queue. We assume that every source 
$s_i$ has a buffer that stores its value $x_i$ and can also send the 
protection value $y_i$. Hence in the channel $L_j$, $s_j$ prepares a 
packet $\mathrm{packet}_{s_j}$ that contains the value 

$$ \mathrm{packet}_{s_j} = (\mathrm{ID}_{s_j}, y_j, \delta), \tag{15} $$

and sender $s_i$ for $i \neq j$ will send its data $x_i$ in a packet 
$\mathrm{packet}_{s_i}$ over the channel $L_i$ defined as follows 

$$ \mathrm{packet}_{s_i} = (\mathrm{ID}_{s_i}, x_i, \delta). \tag{16} $$

In general each source will send $(n - 1)$ packets containing 
plain data, and exactly one packet containing encoded data, in all 
$n$ rounds. The transmission will be repeated in cycles, hence 
every cycle has $n$ rounds. 



Recovery from a single path failure is summarized by the 
next two lemmas. 

Lemma 6: Encoding the data from sources $S \setminus \{s_j\}$ at a 
source $s_j$ in the network $\mathcal{N}$ is enough to protect against a 
single path failure. 

Lemma 7: The total number of encoding operations needed 
to recover from a single link failure in a network $\mathcal{N}$ with 
$n$ sources is given by $(n - 2)$ and the total number of 
transmissions is $n$. 

The previous lemma guarantees the recovery from a single 
arbitrary link failure. 

Lemma 8: In the network model $\mathcal{N}$, throughout each 
cycle, the average network capacity of protecting against a 
single link failure using reduced capacity and network coding 
is given by $(n - 1)/n$. 

Proof: i) We know that every source $s_i$ that sends the 
data $x_i$ over a working path $L_i$ has capacity $c_i = 1$. ii) Also, 
the source $s_j$, which sends the encoded data $y_j$ at different slots, has 
an inactive capacity. iii) The source $s_j$ is not fixed among all 
nodes in $S$; however, it is rotated periodically over all sources for 
fairness. On average one source of the $n$ nodes will reduce its 
capacity. This shows the capacity of $\mathcal{N}$ as stated. ■ 
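As a small worked instance of Lemma 8, take the five-connection setting of Example 3 (the choice $n = 5$ is only for illustration). In any round exactly one path carries encoded data and therefore has unit capacity zero, so the normalized capacity is

$$ \frac{1}{5}\,(1 + 1 + 1 + 1 + 0) = \frac{4}{5} = \frac{n-1}{n}. $$

Since the encoding role rotates over the sources, each source also sends plain data in 4 of the 5 rounds of a cycle, so the reduction is shared equally.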

V. Protecting Networks Against Multiple Link 
Failures 

In the previous section we introduced a strategy for 
self-healing from a single link failure for autonomic networks. 

However, it was shown in [17] through an experimental 
study that about 30% of the failures of the Sprint backbone 
network are multiple link failures. Hence, one needs to design 
a general strategy against multiple link failures for the purpose 
of self-healing. 

In this section we will generalize the above strategy to 
protect against t path failures using network protection codes 
(NPC) and the reduced capacity. We have the following 
assumptions about the channel model: 

i) We assume that any t arbitrary paths may fail and they 
may or may not be correlated. 

ii) Locations of the failures are known, but they are arbitrary 
among n connections. 

iii) In order to protect $n$ working paths, $k$ connections must 
carry plain data, and $m = n - k$ connections must carry 
encoded data. 

iv) We do not add extra protection paths, and every source 
node is able to encode the incoming packets indepen- 
dently. 

v) We consider the encoding and decoding operations are 
performed over F2. 

In Sections VII and VIII we will show the connection between 
error correcting codes that are used for erasure channels and 
the proposed network protection codes [9], [16]. 

Assume that the notations in the previous sections hold. 
Let us assume a network model $\mathcal{N}$ with $t > 1$ path failures. 
One can define a protection code $\mathcal{C}$ which protects $n$ links 
as shown in the systematic matrix $G$ in (17). In general, 
the systematic matrix $G$ defines the source nodes that will 
send encoded messages and source nodes that will send only 
plain messages without encoding. In order to protect $n$ working 
paths, $k$ connections must carry plain data, and $m = n - k$ 
connections must carry encoded data. 

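The generator matrix of the NPC for multiple link failures 
is given by: 

$$
G = \left[ \begin{array}{cccc|cccc}
1 & 0 & \cdots & 0 & p_{11} & p_{12} & \cdots & p_{1m} \\
0 & 1 & \cdots & 0 & p_{21} & p_{22} & \cdots & p_{2m} \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1 & p_{k1} & p_{k2} & \cdots & p_{km}
\end{array} \right], \tag{17}
$$

where the left block is the $k \times k$ identity matrix, the right 
block is the $k \times m$ submatrix $P$, and $p_{ij} \in \mathbb{F}_2$. 

The matrix $G$ can be rewritten as 

$$ G = [\, I_k \mid P \,], \tag{18} $$

where $P$ is the sub-matrix that defines the redundant data 
$\sum_{i=1}^{k} p_{ij}$ to be sent to a set of sources for the purpose of 
self-healing from link failures. Based on the above matrix, 
every source $s_i$ sends its own message $x_i$ to the receiver $r_i$ 
via the link $L_i$. In addition, $m$ links out of the $n$ links will carry 
encoded data. Let $d_{min}$ be the minimum distance (minimum 
weight) of a nonzero vector in the matrix $G$. 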

Definition 9: An $[n, k, d_{min}]_2$ network protection code $\mathcal{C}$ 
is a $k$-dimensional subspace of the space $\mathbb{F}_2^n$ that is able to 
recover from all network failures up to $t = d_{min} - 1$. 

In general the network protection code (NPC), which protects 
against multiple path failures, can be defined by a 
generator matrix $G$ known to every sender and receiver. Also, 
there exists a parity check matrix $H$ corresponding to $G$ such 
that $G H^T = 0$. We will restrict ourselves in this work to 
NPCs that are generated by a given generator matrix $G$ in 
systematic form. In addition, we will assume that the protection 
codes are defined by systematic matrices defined over $\mathbb{F}_2$ [9], 
[16]. An $[n, k, t]_2$ NPC is also an $[n, k, d_{min}]_2$ code, where 
$t = d_{min} - 1$. 

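Without loss of generality, at one particular round and cycle, 
the protection matrix (scheme) among all sources is given by 

$$
\begin{array}{c|cccc|cccc}
 & L_1 & L_2 & \cdots & L_k & L_{k+1} & L_{k+2} & \cdots & L_n \\ \hline
s_1 & x_1 & 0 & \cdots & 0 & p_{11} x_1 & p_{12} x_1 & \cdots & p_{1m} x_1 \\
s_2 & 0 & x_2 & \cdots & 0 & p_{21} x_2 & p_{22} x_2 & \cdots & p_{2m} x_2 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
s_k & 0 & 0 & \cdots & x_k & p_{k1} x_k & p_{k2} x_k & \cdots & p_{km} x_k \\ \hline
T & x_1 & x_2 & \cdots & x_k & y_1 & y_2 & \cdots & y_m
\end{array} \tag{19}
$$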



We ensure that $k = n - m$ paths $L_1, L_2, \ldots, L_k$ have full 
capacity and they carry the plain data $x_1, x_2, \ldots, x_k$. Also, 
all other $m$ paths have inactive capacity, in which they carry 
the encoded data $y_1, y_2, \ldots, y_m$. In addition, the $m$ links are 
not fixed, and they are chosen alternately among the $n$ 
connections. 



A. Encoding and Recovery Operations 

We shall illustrate how the encoding and recovery operations 
are achieved at the sources and receivers, respectively. 
Encoding Process. The network encoding process at the set 
of senders is performed in a similar manner as in Section IV. 
Every source $s_i$ has a copy of the systematic matrix $G$ and it 
will prepare a packet along with its ID in two different cases. 
First, if the source $s_i$ will send only its own data $x_i$ with a 
full link capacity, then 

$$ \mathrm{packet}_{s_i} = (\mathrm{ID}_{s_i}, x_i, \delta). \tag{20} $$

Second, if $\mathbb{S}$ is the set of sources sending encoded messages, 
then 

$$ \mathrm{packet}_{s_j} = \Big(\mathrm{ID}_{s_j}, \sum_{\ell=1,\, s_\ell \notin \mathbb{S}} p_{\ell j} x_\ell, \delta\Big), \tag{21} $$

where $p_{\ell j} \in \mathbb{F}_2$. 

The transmissions are sent in rounds. Therefore, the senders 
will alternate the role of sending plain and encoded data for 
fairness. 
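The encoding step can be illustrated with a short sketch that computes the encoded units from the plain data and the submatrix P. It assumes data units are integers combined by bitwise XOR; the function name `encode_parities`, the sample data, and the particular P (one common parity part of a systematic [7, 4, 3] Hamming code) are illustrative choices, not values taken from the paper.

```python
def encode_parities(x, P):
    """Compute y_j as the XOR of all x_i whose coefficient p_ij equals 1."""
    m = len(P[0])
    y = [0] * m
    for xi, row in zip(x, P):
        for j, p in enumerate(row):
            if p:
                y[j] ^= xi
    return y


# Assumed parity part P (k = 4 rows, m = 3 columns) of G = [I_4 | P].
P = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
x = [0x3A, 0x5C, 0x0F, 0xA1]  # hypothetical plain data units x_1..x_4
y = encode_parities(x, P)     # encoded data units y_1..y_3
```

Because every coefficient is 0 or 1, each encoded unit is simply the XOR of a subset of the plain data units, so no multiplication is needed at the sources.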

Recovery Process. The recovery process is done as follows. 
Assume $t$ failures occur; then a system of linearly independent 
equations in $t$ variables (corresponding to the data lost due 
to the failed paths) can be solved. The packet $\mathrm{packet}_{s_i}$ arrives at a 
receiver $r_i$ with an associated round number, $\delta$. The receiver 
$r_i$ at time slot $n$ will detect the signal in the link $L_i$. If the 
link $L_i$ fails, then $r_i$ will send a query to the other receivers in 
$R \setminus \{r_i\}$ asking for their received data. Assume there are $t$ path 
failures. Then we have three cases: 

1) All $t$ link failures have occurred in links 
that carry encoded packets, i.e., $\mathrm{packet}_{s_j} = 
(\mathrm{ID}_{s_j}, \sum_{\ell=1,\, s_\ell \notin \mathbb{S}} p_{\ell j} x_\ell, \delta)$. In this case no recovery 
operations are needed. 

2) All $t$ link failures have occurred in links that do not 
carry encoded packets, i.e., $\mathrm{packet}_{s_i} = (\mathrm{ID}_{s_i}, x_i, \delta)$. In 
this case, one receiver that carries encoded packets, e.g., 
$r_j$, can send $n - m - 1$ queries to the other receivers 
with active links asking for their received data. After this 
process, the receiver $r_j$ is able to decode all messages 
and will send individual messages to all receivers with 
link failures to pass their correct data. 

3) The $t$ link failures have occurred in arbitrary links. This 
case is a combination of the previous two cases and the 
recovery process is done in a similar way. Only the lost 
data on the working paths need to be recovered. 
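The three recovery cases can be handled uniformly by solving a small linear system over the binary field. The sketch below is a centralized illustration only, assuming one node has collected all surviving data units; it does not model the query exchange between receivers. The function name `recover_plain_data`, the use of None to mark a failed link, and the sample P (the parity part of a systematic [7, 4, 3] Hamming code) are assumptions made for the example.

```python
def recover_plain_data(received, P):
    """Recover lost plain data units from the surviving links.

    `received` has length n = k + m: positions 0..k-1 hold plain data,
    positions k..n-1 hold encoded data, and None marks a failed link.
    Raises ValueError if the failure pattern cannot be resolved.
    """
    k, m = len(P), len(P[0])
    lost = [i for i in range(k) if received[i] is None]
    # One equation per surviving encoded link: coefficients of the lost
    # unknowns, and a right-hand side with all known data XORed out.
    rows = []
    for j in range(m):
        rhs = received[k + j]
        if rhs is None:
            continue
        for i in range(k):
            if P[i][j] and received[i] is not None:
                rhs ^= received[i]
        rows.append(([P[i][j] for i in lost], rhs))

    # Gauss-Jordan elimination over the binary field.
    pivots = []
    for col in range(len(lost)):
        pivot = next((row for row in rows if row[0][col]), None)
        if pivot is None:
            raise ValueError("failure pattern is not recoverable")
        rows.remove(pivot)

        def eliminate(row):
            coeffs, value = row
            if coeffs[col]:
                coeffs = [a ^ b for a, b in zip(coeffs, pivot[0])]
                value ^= pivot[1]
            return coeffs, value

        rows = [eliminate(row) for row in rows]
        pivots = [eliminate(row) for row in pivots]
        pivots.append(pivot)

    data = list(received[:k])
    for col, (_, value) in enumerate(pivots):
        data[lost[col]] = value
    return data


P = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
x = [0x3A, 0x5C, 0x0F, 0xA1]
y = [x[0] ^ x[1] ^ x[3], x[0] ^ x[2] ^ x[3], x[1] ^ x[2] ^ x[3]]
received = x + y
received[1] = None  # two working paths fail (case 2)
received[3] = None
recovered = recover_plain_data(received, P)
```

If only encoded links fail (case 1), the list of lost unknowns is empty and the function returns the plain data unchanged, which mirrors the observation that no recovery operations are needed.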

The proposed network protection scheme using distributed 
capacity and coding is able to recover from up to $t \le d_{min} - 1$ 
link failures (as defined in Definition 9) among $n$ paths and 
it has the following advantages: 

i) $k = n - m$ links have full capacity and their sender nodes 
have the same transmission rate. 

ii) The $m$ links that carry encoded data are dynamic (distributed) 
among all $n$ links. So, no single link $L_i$ will 
always suffer from reduced capacity. 

iii) The encoding process is simple once every sender $s_i$ 
knows the NPC. 

iv) The recovery from link failures is done in a dynamic and 
simple way. Only one receiver node needs to perform the 
decoding process and it passes the data to other receivers 
that have link failures. 



VI. Capacity Analysis 

We now provide a theoretical analysis of our network
protection codes. One can easily compute the number of
paths needed to carry encoded messages to protect against t
link failures, and compute the average network capacity. The
main idea behind NPC is to simplify the encoding operations
at the sources and the recovery operations at the receivers.
The following lemma gives the average normalized
capacity of the proposed network model $\mathcal{N}$ when t failures
occur.

Lemma 10: Let C be a network protection code with parameters
$[n, n-m, d_{min}]$ over $\mathbb{F}_2$. Let n and m be the number
of sources (receivers) and the number of connections carrying
encoded packets, respectively. Then the average normalized capacity
of the network $\mathcal{N}$ is given by

$$\frac{n-m}{n}. \qquad (22)$$

Proof: In any particular round, there are m protection
paths that carry encoded data. Hence there are n - m working
paths that carry plain data. The result follows directly
from the definition of normalized capacity. ■
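
As a quick numerical illustration of Lemma 10, the following minimal sketch evaluates $(n-m)/n$ for a few code lengths. The function name `normalized_capacity` is a hypothetical helper introduced here for illustration; it is not part of the paper.

```python
from fractions import Fraction


def normalized_capacity(n: int, m: int) -> Fraction:
    """Average normalized capacity (n - m)/n from Lemma 10.

    n: total number of connections, m: connections carrying encoded data.
    """
    if n <= 0 or not 0 <= m <= n:
        raise ValueError("need n > 0 and 0 <= m <= n")
    return Fraction(n - m, n)


if __name__ == "__main__":
    # (n, m) pairs taken from the Hamming-code examples in the text.
    for n, m in [(7, 3), (15, 4), (63, 6)]:
        c = normalized_capacity(n, m)
        print(f"n={n}, m={m}: capacity = {c} ~ {float(c):.3f}")
```

Exact fractions are used so that the values can be compared without rounding effects.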

Remark 11: In the network protection model $\mathcal{N}$, in order to
protect against t failures on disjoint links, the minimum distance
$d_{min}$ of the protection code must be at least t + 1.

The previous remark ensures that the maximum number of
failures that can be recovered is $d_{min} - 1$, where $d_{min}$ is the
minimum distance of the network protection code. For simplicity,
we denote an NPC defined over $\mathbb{F}_2$ by $[n, n-m, d_{min}]_2$
unless stated otherwise.

For example, one can use the Hamming codes with parameters
$[2^r - 1, 2^r - r - 1, 3]_2$ to recover from two failures. One
can also puncture or extend these codes to reach the required
length, i.e., the number of connections; see [9] for deriving new
codes from known codes by puncturing, extending, or shortening.
The codes $[7, 4, 3]_2$, $[15, 11, 3]_2$, and $[63, 57, 3]_2$ are
examples of Hamming codes that protect against two link failures.
The protection code $[15, 11, 3]_2$ has 15 connections, of which
11 are working paths and 4 are protection paths; its
minimum distance is 3, so the code protects against 2 link failures.
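
The Hamming-code parameters quoted above follow directly from the integer r. The sketch below is only an illustration under that formula; the helper names `hamming_parameters` and `max_recoverable_failures` are hypothetical.

```python
def hamming_parameters(r: int) -> tuple[int, int, int]:
    """Return (n, k, d_min) = (2^r - 1, 2^r - r - 1, 3) for r >= 2."""
    if r < 2:
        raise ValueError("r must be at least 2")
    n = 2**r - 1
    return n, n - r, 3


def max_recoverable_failures(d_min: int) -> int:
    """Link failures tolerated by an NPC of minimum distance d_min (Remark 11)."""
    return d_min - 1


if __name__ == "__main__":
    for r in (3, 4, 6):
        n, k, d = hamming_parameters(r)
        print(f"r={r}: [{n}, {k}, {d}]_2, working paths={k}, "
              f"protection paths={n - k}, failures={max_recoverable_failures(d)}")
```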

Another example is the BCH codes with arbitrary designed
distance, i.e., $[n, k, d_{min} \ge \delta]_2$. It is well known that the
minimum distance of a BCH code is greater than or equal to
its designed distance. The codes $[15, 11, 3]_2$, $[31, 26, 3]_2$ and
$[63, 56, 3]_2$ are examples of BCH codes that protect against up to two
link failures. Also, $[15, 8, 5]_2$, $[31, 21, 5]_2$ and $[48, 36, 5]_2$ are
examples of BCH codes that protect against four link failures [9], [16].
In the next section we will include tables of the best known
network protection codes.



VII. Code Constructions and Bounds 

Assume we have n established connections in the network
model $\mathcal{N}$. The goal is to design a good protection code that
protects against t failures. By a good protection code we mean
one that, for a given number of connections n and failures t, has
a large number of working paths, and hence a high
performance. In addition, we establish bounds on the
network protection code parameters below.

One way to achieve our goal is to design codes with
arbitrary minimum distances; the reader can consult any
introductory coding theory book, for example [9], [16]. In this
case a BCH code with designed distance d and length n can
be used to achieve this goal.

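We shall quickly review the essential construction of nonprimitive
narrow-sense BCH codes that will be used in the next
section. Let q be a prime power, and n, $\mu$ and d be positive
integers such that gcd(q, n) = 1 and $2 \le d \le n$. Furthermore,
$\mu$ is the multiplicative order of q modulo n. Let $\alpha$ be a
primitive element in $\mathbb{F}_{q^\mu}$. A nonprimitive narrow-sense BCH
code C of designed distance d and length
$q^{\lfloor \mu/2 \rfloor} < n \le q^{\mu} - 1$
over $\mathbb{F}_q$ is a cyclic code with a monic generator polynomial
g(x) that has $\alpha, \alpha^2, \ldots, \alpha^{d-1}$ as zeros,

$$g(x) = \prod_{i=1}^{d-1} (x - \alpha^i). \qquad (23)$$

Thus, c is a codeword in C if and only if
$c(\alpha) = c(\alpha^2) = \cdots = c(\alpha^{d-1}) = 0$.
The parity check matrix of this code can be defined as

$$H_{bch} = \begin{pmatrix}
1 & \alpha & \alpha^{2} & \cdots & \alpha^{n-1} \\
1 & \alpha^{2} & \alpha^{4} & \cdots & \alpha^{2(n-1)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & \alpha^{d-1} & \alpha^{2(d-1)} & \cdots & \alpha^{(d-1)(n-1)}
\end{pmatrix}. \qquad (24)$$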



If the minimum distance of this code is $d_{min} \ge d$, then
the code can recover from up to $d_{min} - 1$ failures. In this case the
number of connections that will carry plain data satisfies:

$$k \ge n - \mu \lceil (d-1)(1 - 1/q) \rceil. \qquad (25)$$

But this is only a lower bound on the dimension of the NPC,
i.e., the number of working connections that carry plain data.
Therefore, we seek a result that determines the exact dimension.
Fortunately, this can be obtained when the designed distance
of the BCH code is bounded. The following theorem enables
one to determine the dimension in closed form for BCH codes
of small designed distance.

Theorem 12: Let q be a prime power and gcd(n, q) = 1,
with $q^{\mu} \equiv 1 \bmod n$. Then a narrow-sense BCH code of length
$q^{\lfloor \mu/2 \rfloor} < n \le q^{\mu} - 1$ over $\mathbb{F}_q$ with
designed distance d in the range
$2 \le d \le d_{max} = \min\{\lfloor n q^{\lceil \mu/2 \rceil} / (q^{\mu} - 1) \rfloor, n\}$
has dimension

$$k = n - \mu \lceil (d-1)(1 - 1/q) \rceil. \qquad (26)$$

Proof: See [4, Theorem 10]. ■
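
The closed form in Theorem 12 is easy to evaluate numerically. The following is a minimal sketch, assuming $\mu$ is the multiplicative order of q modulo n as defined above; the function names are hypothetical helpers, and the range checks mirror the hypotheses of the theorem.

```python
from math import gcd


def multiplicative_order(q: int, n: int) -> int:
    """Smallest mu >= 1 with q**mu == 1 (mod n); requires n >= 2, gcd(q, n) = 1."""
    if n < 2 or gcd(q, n) != 1:
        raise ValueError("need n >= 2 and gcd(q, n) = 1")
    mu, power = 1, q % n
    while power != 1:
        power = (power * q) % n
        mu += 1
    return mu


def bch_dimension(n: int, d: int, q: int = 2) -> int:
    """Dimension k = n - mu * ceil((d-1)(1-1/q)) under the range of Theorem 12."""
    mu = multiplicative_order(q, n)
    if not q ** (mu // 2) < n <= q**mu - 1:
        raise ValueError("length n outside the range of the theorem")
    d_max = min((n * q ** ((mu + 1) // 2)) // (q**mu - 1), n)
    if not 2 <= d <= d_max:
        raise ValueError(f"designed distance must satisfy 2 <= d <= {d_max}")
    # ceil((d-1)(q-1)/q) computed with integer arithmetic only
    ceil_term = -((-(d - 1) * (q - 1)) // q)
    return n - mu * ceil_term


if __name__ == "__main__":
    for n, d in [(15, 3), (31, 5), (31, 7), (127, 5), (127, 15)]:
        k = bch_dimension(n, d)
        print(f"n={n}, d={d}: k={k}, protection paths m={n - k}")
```

Integer arithmetic is used for the ceiling so that no floating-point rounding can affect the result.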
For small designed distance d we can exactly compute
the minimum distance of the BCH code (see Tables I, II,
and III) and consequently determine the dimension of the
protection code. This helps us to compute the number of failures
that the network protection code can recover. In practical cases,
the number of failures t is small in comparison to the number
of connections n, which makes it easy to compute the
parameters of the network protection codes exactly. Theorem 12
makes it straightforward to derive the exact parameters
of NPCs based on BCH codes.

TABLE I

BEST KNOWN NETWORK PROTECTION CODES AGAINST SINGLE AND DOUBLE
LINK FAILURES

n      m     code               type
7      3     [7, 4, 3]_2        Hamming code
10     4     [10, 6, 3]_2       Linear code
15     4     [15, 11, 3]_2      Hamming code
19     7     [19, 12, 3]_2      Extension construction
23     8     [23, 15, 3]_2      Extension construction
25     5     [25, 20, 3]_2      Linear code
31     5     [31, 26, 3]_2      Hamming code
39     8     [39, 31, 3]_2      Extension construction
47     9     [47, 38, 3]_2      Extension construction
63     6     [63, 57, 3]_2      Hamming code
71     8     [71, 63, 3]_2      Matrix construction
79     9     [79, 70, 3]_2      Extension construction
95     10    [95, 85, 3]_2      Extension construction
127    7     [127, 120, 3]_2    Hamming code

TABLE II

BEST KNOWN NETWORK PROTECTION CODES AGAINST UP TO FOUR LINK
FAILURES. SUCH CODES CAN BE PUNCTURED, EXTENDED, OR SHORTENED
TO OBTAIN THE REQUIRED LENGTH AS SHOWN IN SECTION VIII

n      m     code               type
15     7     [15, 8, 5]_2       Hamming code
19     8     [19, 11, 5]_2      Lengthening Hamming-Preparata code
20     11    [20, 9, 5]_2       Lengthening Hamming-Preparata code
23     9     [23, 14, 5]_2      Linear code
31     10    [31, 21, 5]_2      BCH code
33     10    [33, 23, 5]_2      Linear code
35     13    [35, 22, 5]_2      Shortening Preparata code
63     11    [63, 52, 5]_2      Preparata code
70     12    [70, 58, 5]_2      Lengthening Hamming-Preparata code
81     13    [81, 68, 5]_2      Linear code
128    14    [128, 114, 5]_2    BCH code
135    18    [135, 117, 5]_2    Shortening Preparata code

We shall give many families of NPCs derived from
BCH codes over $\mathbb{F}_2$. Finally, one can also start
from a code of a given length n and puncture,
shorten, or extend this code, see [9, Chapter 1]. This
changes the number of working and protection
paths and the number of failures from which the code can recover.



A. Bounds on the Code Parameters 

Bounds on the code parameters are needed to measure
a code's performance and its error recovery and detection capabilities.
For given code length n and dimension k, we
establish a bound on the minimum distance of the protection
codes derived in the previous section.

The best-known upper bounds on error-correcting
codes over symmetric and erasure channels are the Singleton
and Hamming bounds [9], [16]. The Singleton bound
relates the length, dimension,
and minimum distance of the code, i.e., n, k, and
$d_{min}$. However, it does not involve
the alphabet size q. The packing bound,
known as the Hamming bound, takes into consideration the code
parameters n, k, $d_{min}$ along with q.

We can also state upper bounds on the network protection
codes [9], [16]. The Singleton bound on the network protection
code parameters is stated as follows. Let t be the number of
failures against which the code can protect. Then

$$t \le n - k. \qquad (27)$$

Equality in this bound will hold if the size of the
finite field used is greater than n - t.

One can also state the Hamming bound in terms of the network code
parameters as follows:

$$\sum_{i=0}^{\lfloor (d_{min}-1)/2 \rfloor} \binom{n}{i} (q-1)^i \le q^{n-k}. \qquad (28)$$

For the binary case, with m = n - k protection paths, the Hamming
bound becomes

$$\sum_{i=0}^{\lfloor (d_{min}-1)/2 \rfloor} \binom{n}{i} \le 2^{m}. \qquad (29)$$

We have the following lemma on the minimum number of
protection paths of a network protection code.

Lemma 13:

$$m \ge \max\Big\{ d_{min} - 1,\; \log_q \Big( \sum_{i=0}^{\lfloor t/2 \rfloor} \binom{n}{i} (q-1)^i \Big) \Big\}. \qquad (30)$$

Proof: The proof is a direct consequence of the Singleton
and Hamming bounds. Applying Equations (27) and (28)
gives the result. ■
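
The bound of Lemma 13 can be evaluated for concrete parameters. The sketch below is an illustration only: it assumes $t = d_{min} - 1$ and rounds the logarithm up to the next integer, since m counts paths. The function name `min_protection_paths` is hypothetical.

```python
from math import comb


def min_protection_paths(n: int, d_min: int, q: int = 2) -> int:
    """Lower bound on m from Lemma 13 (Singleton and Hamming bounds combined)."""
    t = d_min - 1
    # Volume of a Hamming ball of radius floor(t/2) in F_q^n.
    ball = sum(comb(n, i) * (q - 1) ** i for i in range(t // 2 + 1))
    # Smallest integer m_h with q**m_h >= ball, i.e. ceil(log_q(ball)),
    # found by integer search to avoid floating-point logarithms.
    m_hamming = 0
    while q**m_hamming < ball:
        m_hamming += 1
    return max(d_min - 1, m_hamming)


if __name__ == "__main__":
    for n, d in [(7, 3), (15, 3), (31, 5), (127, 5)]:
        print(f"n={n}, d_min={d}: m >= {min_protection_paths(n, d)}")
```

For the Hamming codes the bound is met with equality, which is consistent with their being perfect codes.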

VIII. Tables of Best Known Protection Codes 

In this section we investigate which codes are suitable for
network self-healing against link failures. We present
several network protection codes with given generator matrices
and exact parameters. The proposed codes are not necessarily
optimal, i.e., they do not necessarily meet the Singleton bound
with equality. The classical Singleton bound is given by

$$k \le n - d_{min} + 1. \qquad (31)$$

This bound shows that the number of protection paths must
be at least $d_{min} - 1$, i.e., $m \ge d_{min} - 1$. Equality
holds in the case of a single path failure.

We notice that not all senders participate in each encoding
vector. This means that the proposed codes are suitable for
the general protection case where a set of working paths
is protected by a protection path. This also allows our proposed
codes to be used for network protection using
p-cycles [12], [14].

The codes shown in Table I are used to protect against
single and double link failures using their symmetric generator
matrices. Similarly, the codes in Table II protect against up to
four link failures. Table III presents the
best known BCH codes for arbitrary minimum distance over
$\mathbb{F}_2$.

TABLE III

FAMILIES OF BCH CODES THAT CAN BE USED AS NETWORK PROTECTION CODES
AGAINST LINK FAILURES

n      m     BCH code
15     4     [15, 11, 3]
15     7     [15, 8, 4]
15     8     [15, 7, 5]
31     5     [31, 26, 3]
31     10    [31, 21, 5]
31     15    [31, 16, 7]
31     20    [31, 11, 11]
31     25    [31, 6, 15]
127    14    [127, 113, 5]
127    49    [127, 78, 15]
127    21    [127, 106, 7]
127    50    [127, 77, 27]

Given an NPC with parameters $[n, k, d_{min}]_2$, one can possibly
obtain a new NPC by shortening, extending, or puncturing
this code. If there is an NPC C with parameters $[n, k, d_{min}]_2$,
then i) shortening C yields a code with parameters
$[n-1, k-1, d_{min}]_2$, ii) puncturing C yields a code with parameters
$[n-1, k, d_{min}-1]_2$, iii) appending C yields a code with
parameters $[n+1, k, d_{min}+1]_2$, and iv) extending C yields a code
with parameters $[n+1, k+1, d_{min}]_2$.

For example, if there is a BCH (Hamming) code with
parameters $[15, 11, 3]_2$, then there must be codes with parameters
$[14, 10, 3]_2$ (by shortening), $[14, 11, 2]_2$ (by puncturing),
$[16, 11, 4]_2$ (by appending), and $[16, 12, 3]_2$ (by extending).
Interested readers may consult textbooks on classical coding
theory for further propagation rules [9], [16].
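
The four propagation rules listed above are simple parameter maps. The sketch below merely restates them as stated in the text (it does not construct the codes themselves); `derived_parameters` is a hypothetical helper name.

```python
def derived_parameters(n: int, k: int, d: int) -> dict[str, tuple[int, int, int]]:
    """Parameters obtained from an [n, k, d]_2 NPC by the rules quoted in the text."""
    return {
        "shortening": (n - 1, k - 1, d),
        "puncturing": (n - 1, k, d - 1),
        "appending": (n + 1, k, d + 1),
        "extending": (n + 1, k + 1, d),
    }


if __name__ == "__main__":
    for rule, (n, k, d) in derived_parameters(15, 11, 3).items():
        print(f"{rule:>10}: [{n}, {k}, {d}]_2 -> "
              f"{k} working paths, {n - k} protection paths, {d - 1} failures")
```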

A. Illustrative Examples 

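Example 14: Consider a BCH code C with parameters
$[15, 11, 3]_2$ that has designed distance 3 and generator matrix
G given by:

$$G = \left[\begin{array}{ccccccccccc|cccc}
1&0&0&0&0&0&0&0&0&0&0&1&1&0&0\\
0&1&0&0&0&0&0&0&0&0&0&0&1&1&0\\
0&0&1&0&0&0&0&0&0&0&0&0&0&1&1\\
0&0&0&1&0&0&0&0&0&0&0&1&1&0&1\\
0&0&0&0&1&0&0&0&0&0&0&1&0&1&0\\
0&0&0&0&0&1&0&0&0&0&0&0&1&0&1\\
0&0&0&0&0&0&1&0&0&0&0&1&1&1&0\\
0&0&0&0&0&0&0&1&0&0&0&0&1&1&1\\
0&0&0&0&0&0&0&0&1&0&0&1&1&1&1\\
0&0&0&0&0&0&0&0&0&1&0&1&0&1&1\\
0&0&0&0&0&0&0&0&0&0&1&1&0&0&1
\end{array}\right] \qquad (32)$$

The code C over $\mathbb{F}_2$ can be used to recover from two link
failures since its minimum distance is 3. One can puncture,
shorten, or extend this code to obtain the required code length,
which determines the total number of disjoint connections.
In this example we have 15 connections, 11 of which are primary
working paths. Furthermore, the links $L_{12}, L_{13}, L_{14}, L_{15}$ will
carry encoded data. The matrix G presents the construction of the
NPC, and specifies which senders send encoded and which send plain data.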
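
To make the role of G in Example 14 concrete, the following minimal sketch encodes 11 data bits onto 15 links and recovers the data after link failures by solving linear equations over $\mathbb{F}_2$. It assumes G has the systematic form [I | P] with the 11 x 4 block P listed in the code, works on single bits rather than packets, and uses hypothetical helper names (`encode`, `recover`); it is an illustration, not the authors' implementation.

```python
# Rows of the 11 x 4 parity block P (assumed systematic form G = [I_11 | P]).
P_ROWS = ["1100", "0110", "0011", "1101", "1010", "0101",
          "1110", "0111", "1111", "1011", "1001"]
K, M = 11, 4  # working paths, protection paths


def encode(x):
    """Map 11 data bits to the 15 bits sent on links L1..L15."""
    parity = [0] * M
    for bit, row in zip(x, P_ROWS):
        if bit:
            parity = [p ^ int(c) for p, c in zip(parity, row)]
    return list(x) + parity


def recover(received):
    """Recover the 11 data bits; failed links are marked with None."""
    unknown = [i for i in range(K) if received[i] is None]
    rows = []  # one equation over F_2 per surviving protection link
    for j in range(M):
        if received[K + j] is None:
            continue
        rhs = received[K + j]
        for i in range(K):
            if received[i] is not None and P_ROWS[i][j] == "1":
                rhs ^= received[i]
        rows.append([int(P_ROWS[i][j]) for i in unknown] + [rhs])
    # Gauss-Jordan elimination over F_2 on the unknown data bits.
    for col in range(len(unknown)):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            raise ValueError("failure pattern is not recoverable")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[col])]
    data = list(received[:K])
    for idx, i in enumerate(unknown):
        data[i] = rows[idx][-1]
    return data


if __name__ == "__main__":
    x = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
    sent = encode(x)
    sent[0] = None   # failure on working link L1
    sent[3] = None   # failure on working link L4
    print(recover(sent) == x)
```

Because the minimum distance is 3, any pattern of at most two failed links leaves enough equations to solve for the missing data bits.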

Example 15: The code C has parameters $[15, 8, 4]_2$ and
generator matrix G given by:

[8 x 15 binary generator matrix G; entries not recoverable from the source] (33)

This means that the senders $s_1, \ldots, s_8$ will send plain data
over the working connections $L_1, \ldots, L_8$. Also, the senders
$s_9, \ldots, s_{15}$ will send encoded data over the protection paths
$L_9, \ldots, L_{15}$. In this encoding scheme, the connection $L_9$ will
carry encoded data from $s_1$, $s_5$, $s_7$ and $s_8$.



IX. ILP Formulation 

The problem of finding link disjoint paths between pairs of 
nodes in a graph is known to be an NP-complete problem [22]. 
Hence, even finding the working paths in this problem is hard. 
We therefore introduce an Integer Linear Program (ILP) for 
solving the reduced capacity network coding-based protection 
problem introduced in this paper. 

The purpose of the ILP is to find a feasible provisioning for 
groups of connections, such that: 

• The paths used by a group of connections protected 
together are mutually link disjoint. 

• There is a circuit, S, which connects the sources of all 
connections protected together, and this circuit is link 
disjoint from the working paths. The S circuit is used 
to exchange source data units in order to form the linear 
combination of data units to be sent on the path used for 
that purpose. 

• There is a circuit, R, which connects the receivers of 
all connections protected together, and this circuit is link 
disjoint from the working paths. The R circuit is used 
by the receivers to recover from lost data units due to a 
failure. 

• The total number of links used by the working paths, the 
S circuit and the R circuit is minimal. 

We assume that the number of channels per span is not 
upper bounded, i.e., the network is uncapacitated. 

The following table defines the input parameters to the ILP:

N                number of connections
$s_h$            source of connection h
$r_h$            destination of connection h
$\delta^{hl}$    a binary indicator which is equal to 1 if connections
                 h and l have the same destination
$\gamma^{hl}$    a binary indicator which is equal to 1 if connections
                 h and l have the same source

The variables used in the formulation are given below:

$n^{hl}$         binary variable which is 1 if and only if connections
                 h and l are protected together
$z^h_{ij}$       binary variable which is 1 if and only if connection
                 h uses link (i, j) on the working path
$p^h_{ij}$       binary variable which is 1 if and only if connection
                 h uses link (i, j) on its S circuit
$q^h_{ij}$       binary variable which is 1 if and only if connection
                 h uses link (i, j) on its R circuit
$b^h_{ij}$       binary variable which is 1 if connection h uses
                 link (i, j) on its backup path
$P^{hl}_j$       binary variable, which is 1 if and only if the S
                 circuits for connections h and l share a node, j
                 (required if $n^{hl} = 1$)
$Q^{hl}_j$       binary variable, which is 1 if and only if the R
                 circuits for connections h and l share a node, j
                 (required if $n^{hl} = 1$ and $\delta^{hl} = 0$)
$P^h$            binary variable, which is 1 if and only if connection
                 h is protected with another connection
                 that has a source different from that of h (this
                 variable is important since if h is not protected
                 with another such connection, there is no need
                 for the S circuit)
$Q^h$            binary variable, which is 1 if and only if connection
                 h is protected with another connection
                 that has a destination different from that of h
                 (this variable is also important since if h is not
                 protected with another such connection, there is
                 no need for the R circuit)
$\mathcal{P}^{hl}_{ij}$   binary variable which is 1 if and only if connections
                 h and l are protected together, and share link
                 (i, j) on the S circuit
$\mathcal{Q}^{hl}_{ij}$   binary variable which is 1 if and only if connections
                 h and l are protected together, and share link
                 (i, j) on the R circuit
$\pi^h_{ij}$     binary variable which is equal to 1 if connection
                 h is the lowest numbered connection, among a
                 number of jointly protected connections, to use
                 link (i, j) on its S circuit (used in computing the
                 cost of the S circuit)
$\theta^h_{ij}$  binary variable which is equal to 1 if connection
                 h is the lowest numbered connection, among a
                 number of jointly protected connections, to use
                 link (i, j) on its R circuit (used in computing the
                 cost of the R circuit)
$\beta^h_{ij}$   binary variable which is equal to 1 if the secondary
                 protection path for connection h uses link (i, j)

Minimize:

$$\sum_{i,j,h} \left( z^h_{ij} + \beta^h_{ij} + 0.5\,\pi^h_{ij} + 0.5\,\theta^h_{ij} \right)$$

In the above, the summation is the cost of the links used by
the connections' working paths and the S and R circuits. It
also includes the cost of a secondary circuit for 1+1 protection,
in case network coding-based protection cannot be used. The
calculation of these cost factors is explained with the
constraints below.
Subject to:

The following constraints are enforced on the working and
protection paths.

I- Constraints on working paths:

$$z^h_{i,s_h} = 0 \quad \forall h,\ i \ne s_h \qquad (34)$$
$$z^h_{r_h,j} = 0 \quad \forall h,\ j \ne r_h \qquad (35)$$
$$\sum_i z^h_{s_h,i} = 1 \quad \forall h \qquad (36)$$
$$\sum_i z^h_{i,r_h} = 1 \quad \forall h \qquad (37)$$
$$\sum_i z^h_{ij} = \sum_i z^h_{ji} \quad \forall h,\ j \ne s_h, r_h \qquad (38)$$
$$z^h_{ij} + z^h_{ji} + z^l_{ij} + z^l_{ji} + n^{hl} \le 2 \quad \forall h, l, i, j \qquad (39)$$

Equations (34), (36), (35) and (37) ensure that the traffic on
the working path is generated and consumed by the source and
destination nodes, respectively. Equation (38) guarantees flow
continuity on the working path. Equation (39) ensures that the
working paths of two connections which are protected together
are link disjoint. Since a working path cannot use two links in
opposite directions on the same span (or edge in the graph),
two connections which are protected together cannot use
the same span either in the same or in opposite directions. Such
a condition is included in Equation (39).
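
Constraint (39) can be checked directly for two candidate working paths. The following is a minimal sketch, assuming each path is given as a list of nodes and that the link-usage variables $z^h_{ij}$ are derived from consecutive node pairs; the helper names are hypothetical and no ILP solver is involved.

```python
def directed_links(path):
    """Set of directed links (i, j) traversed by a path given as a node list."""
    return set(zip(path, path[1:]))


def satisfies_constraint_39(path_h, path_l, n_hl, nodes):
    """Check z^h_ij + z^h_ji + z^l_ij + z^l_ji + n^hl <= 2 for all node pairs."""
    zh, zl = directed_links(path_h), directed_links(path_l)
    for i in nodes:
        for j in nodes:
            total = (((i, j) in zh) + ((j, i) in zh)
                     + ((i, j) in zl) + ((j, i) in zl) + n_hl)
            if total > 2:
                return False
    return True


if __name__ == "__main__":
    nodes = range(1, 6)
    # The two paths share node 2 but no span, so joint protection is allowed.
    print(satisfies_constraint_39([1, 2, 3], [4, 2, 5], 1, nodes))
    # These paths use span {2, 3} in opposite directions, which is not allowed
    # when the connections are protected together (n_hl = 1).
    print(satisfies_constraint_39([1, 2, 3], [3, 2, 4], 1, nodes))
```

With `n_hl = 0` the constraint is inactive, which matches the fact that connections that are not protected together may share spans.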

II- Constraints on secondary protection circuits:

$$b^h_{i,s_h} = 0 \quad \forall h,\ i \ne s_h \qquad (40)$$
$$b^h_{r_h,j} = 0 \quad \forall h,\ j \ne r_h \qquad (41)$$
$$\sum_i b^h_{s_h,i} = 1 \quad \forall h \qquad (42)$$
$$\sum_i b^h_{i,r_h} = 1 \quad \forall h \qquad (43)$$
$$\sum_i b^h_{ij} = \sum_i b^h_{ji} \quad \forall h,\ j \ne s_h, r_h \qquad (44)$$
[equation (45) not recoverable from the source]
$$z^h_{ij} + \beta^h_{ij} \le 1 \quad \forall h, i, j \qquad (46)$$

The above constraints evaluate the cost of the secondary
protection paths used for 1+1 protection. There are two sets
of variables in the calculation of this cost. The first one is
the $b^h_{ij}$ variables, which are evaluated in Equations (40)-(44)
in exactly the same way the $z^h_{ij}$ variables are evaluated.
However, the cost that goes into the objective function depends
on whether connection h is protected with another connection
using network coding or not. The variables which evaluate
this cost are the $\beta^h_{ij}$ variables, which are evaluated in Equation
(45), which makes $\beta^h_{ij}$ equal to $b^h_{ij}$ only if the connection is
not protected with another connection. Finally, Equation (46)
makes sure that the working and the used secondary paths are
link disjoint.

III- Constraints on S circuits:



$$P^h \ge n^{hl} - \gamma^{hl} \quad \forall h, l \qquad (47)$$
[equations (48) and (49) not recoverable from the source]
$$\sum_i p^h_{ij} = \sum_i p^h_{ji} \quad \forall h, j \qquad (50)$$
$$z^h_{ij} + \frac{p^h_{ij} + p^h_{ji}}{2} \le 1 \quad \forall h, i, j \qquad (51)$$
$$z^h_{ij} + \frac{p^l_{ij} + p^l_{ji}}{2} + n^{hl} \le 2 \quad \forall h, l, i, j \qquad (52)$$
$$\sum_i (p^h_{ij} + p^h_{ji}) \ge 2 P^{hl}_j \quad \forall h, l, j \qquad (53)$$
$$\sum_i (p^l_{ij} + p^l_{ji}) \ge 2 P^{hl}_j \quad \forall h, l, j \qquad (54)$$
$$\sum_j P^{hl}_j \ge n^{hl} - \gamma^{hl} \quad \forall h, l \qquad (55)$$

Equation (47) ensures that the source of connection h will
be connected to an S circuit only if it is jointly protected with
another connection, l. However, there is one exception,
namely the case in which the two connections h and l
have the same source. In this case, the S circuit is not needed,
and this is why $\gamma^{hl}$ is subtracted from the right hand side
of the equation. Notice that if h is protected together with
another connection that has a different source, then Equation
(47) will require that an S circuit be used. Equations (48)
and (49) ensure that traffic leaves and enters $s_h$ using the
S circuit only if h is jointly protected with another connection
that has a different source, i.e., when $P^h = 1$. Equation (50)
guarantees connection h's flow continuity on the S circuit.
Equation (51) makes sure that the working path and its S
circuit are link disjoint, while Equation (52) makes sure that if
two connections h and l are jointly protected, then the S circuit
of l must also be disjoint from the working path of connection
h. Notice that both Equations (51) and (52) allow an S circuit
to use two links in opposite directions on the same span, and
this is why the sum of the corresponding link usage variables
is divided by 2 in both equations. Equations (53), (54) and
(55) make sure that if two connections, h and l, are protected
together ($n^{hl} = 1$), then their S circuits must have at least one
joint node ($P^{hl}_j = 1$ for some j). However, similar to Equation
(47), an S circuit is not needed if the two connections have the
same source, hence the subtraction of $\gamma^{hl}$ from the right hand
side of Equation (55).

Notice that in the ILP formulation, the constraints implement
the S circuit as a set of paths, such that there is a path
from each source back to itself. However, the requirement of
at least one joint node between every pair of such paths, as
enforced by constraint (55), will make sure that the S circuit
takes the form of a tree.

IV- Constraints on R circuits:

$$Q^h \ge n^{hl} - \delta^{hl} \quad \forall h, l \qquad (56)$$
[equations (57) and (58) not recoverable from the source]
$$\sum_i q^h_{ij} = \sum_i q^h_{ji} \quad \forall h, j \qquad (59)$$
$$z^h_{ij} + \frac{q^h_{ij} + q^h_{ji}}{2} \le 1 \quad \forall h, i, j \qquad (60)$$
$$z^h_{ij} + \frac{q^l_{ij} + q^l_{ji}}{2} + n^{hl} \le 2 \quad \forall h, l, i, j \qquad (61)$$
$$\sum_i (q^h_{ij} + q^h_{ji}) \ge 2 Q^{hl}_j \quad \forall h, l, j \qquad (62)$$
$$\sum_i (q^l_{ij} + q^l_{ji}) \ge 2 Q^{hl}_j \quad \forall h, l, j \qquad (63)$$
$$\sum_j Q^{hl}_j \ge n^{hl} - \delta^{hl} \quad \forall h, l \qquad (64)$$

Equations (56)-(64) are similar to Equations (47)-(55), but
they apply to the destinations and to the R circuit. Therefore,
the variables $P^h$, $\gamma^{hl}$, $p^h_{ij}$ and $P^{hl}_j$ are replaced by
$Q^h$, $\delta^{hl}$, $q^h_{ij}$ and $Q^{hl}_j$, respectively.



Constraints on joint protection:

$$n^{hl} + n^{lm} - 1 \le n^{hm} \quad \forall h, l, m \qquad (65)$$

Equation (65) makes sure that if connections h and l are
protected together, and connections l and m are also protected
together, then connections h and m are protected together.
V- Constraints for cost evaluation:

$$\mathcal{P}^{hl}_{ij} \le \frac{p^h_{ij} + p^l_{ij} + n^{hl}}{3} \quad \forall i, j, h, l \qquad (66)$$
$$\mathcal{Q}^{hl}_{ij} \le \frac{q^h_{ij} + q^l_{ij} + n^{hl}}{3} \quad \forall i, j, h, l \qquad (67)$$
$$\pi^l_{ij} \ge p^l_{ij} - \sum_{h=1}^{l-1} \mathcal{P}^{hl}_{ij} \quad \forall i, j, l \qquad (68)$$
$$\theta^l_{ij} \ge q^l_{ij} - \sum_{h=1}^{l-1} \mathcal{Q}^{hl}_{ij} \quad \forall i, j, l \qquad (69)$$

Equations (66), (67), (68) and (69) are used to evaluate the
cost of the S and R circuits, which are used in the objective
function. Equation (66) will make sure that $\mathcal{P}^{hl}_{ij}$ cannot be 1
unless connections h and l are protected together and share
link ij on the S circuit. Equation (67) will do the same thing
for the R circuit. Note that both $\mathcal{P}^{hl}_{ij}$ and $\mathcal{Q}^{hl}_{ij}$ should be
as large as possible since this will result in decreasing the
cost of the S and R circuits, as shown in Equations (68) and
(69).

In Equation (68), $\pi^l_{ij}$ for connection l will be equal to
1 only if it is not protected on link ij with another lower
indexed connection, and will be equal to 0 otherwise. That
is, it is the lowest numbered connection among a group of
jointly protected connections that will contribute to the cost
of the links shared by the S circuit. $\theta^l_{ij}$, which is evaluated by
Equation (69), will also follow a similar rule, but for the R
circuit.

TABLE IV

COST COMPARISON BETWEEN 1+1 AND 1+N PROTECTION FOR
NETWORKS WITH |V| = 6, |E| = 9; |V| = 8, |E| = 12; AND
|V| = 10, |E| = 20.

                  1+1                        NPC
|V|, |E|    N     Total  Working  Spare     Total  Working  Spare
6, 9        4     15     6        9         14     6        8
6, 9        5     17     6        11        12     6        6
8, 12       4     19     9        10        16     8        8
8, 12       6     26     10       16        21     11       10
10, 20      4     16     6        10        12     6        6
10, 20      6     23     10       13        19     9        10
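
The relative savings implied by Table IV can be recomputed from the table entries. The sketch below is only an arithmetic check under the assumption that saving is measured as (cost of 1+1 minus cost of NPC) divided by the cost of 1+1; the dictionary layout and function names are hypothetical.

```python
# (|V|, |E|, N) -> ((total, working, spare) for 1+1, (total, working, spare) for NPC)
TABLE_IV = {
    (6, 9, 4): ((15, 6, 9), (14, 6, 8)),
    (6, 9, 5): ((17, 6, 11), (12, 6, 6)),
    (8, 12, 4): ((19, 9, 10), (16, 8, 8)),
    (8, 12, 6): ((26, 10, 16), (21, 11, 10)),
    (10, 20, 4): ((16, 6, 10), (12, 6, 6)),
    (10, 20, 6): ((23, 10, 13), (19, 9, 10)),
}


def saving(cost_1p1: int, cost_npc: int) -> float:
    """Relative saving of NPC with respect to 1+1 protection."""
    return (cost_1p1 - cost_npc) / cost_1p1


def savings(key):
    """Return (saving in total cost, saving in spare/protection cost)."""
    (total_1p1, _, spare_1p1), (total_npc, _, spare_npc) = TABLE_IV[key]
    return saving(total_1p1, total_npc), saving(spare_1p1, spare_npc)


if __name__ == "__main__":
    for key in TABLE_IV:
        total, spare = savings(key)
        print(f"|V|={key[0]}, |E|={key[1]}, N={key[2]}: "
              f"total saving {total:.1%}, protection saving {spare:.1%}")
```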



X. ILP Evaluation and Cost Comparison

In this section we use results from the ILP formulation developed
in the previous section to evaluate the cost of provisioning
circuits to provide self-healing in autonomic networks using
the proposed network protection codes. The ILP was solved
using the Cplex linear programming solver [11]. We also
compare the cost of provisioning NPC to that of provisioning
1+1 protection. The cost of 1+1 protection is evaluated using
Bhandari's algorithm [5].

We ran the ILP for various network topologies. The network
topologies are generated randomly. First, we consider a
bidirectional network with 6 nodes and 9 edges along with
4 and 5 connections. Second, we consider a network with 8
nodes and 12 edges along with 4 and 6 connections. Finally,
we consider a network with 10 nodes and 20 edges, while
provisioning 4 and 6 connections.

The results shown in Table IV indicate that the cost of
provisioning self-healing using NPC is always lower than that
using 1+1 protection, and the saving in the protection resources
can exceed 30%. For example, consider a network
with 8 nodes, 12 edges, and 6 connections. The total cost of using
the 1+1 strategy is 26, while the total cost of using NPC is
21. The total saving in resources in this case is close to 20%.
However, the saving in the protection resources alone is more
than 30%. The advantage of using NPC over 1+1 protection
may improve further with the size of the network. For
example, for the case of the network with 10 nodes, 20 edges,
and 4 connections, the total cost of 1+1 protection is 16, while
the total cost of NPC is 12, which means a total saving of 25%.
The saving in the protection resources is 40% in this case.

XI. Conclusions

We studied a model for recovering from network link
failures using network coding. We defined the concept of
network protection codes to protect against a single link
failure, and then extended this concept and the techniques
to protect against t link failures using network coding and
reduced capacity. Such protection codes provide self-healing in
autonomic networks with a reduced control and management
plane complexity. We showed that the encoding and decoding
processes are simple and can be done in a dynamic way.
We also developed an ILP formulation to optimally provision
communication sessions and the circuits needed to implement
NPC. This formulation was then used to assess the cost of
implementing this strategy, and to compare it to the cost of
using 1+1 protection. It was shown that the use of NPC for
self-healing has an advantage over 1+1 protection, in terms of
the cost of connection and backup circuit provisioning.